Overview

Dataset statistics

Number of variables: 19
Number of observations: 40395
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.9 MiB
Average record size in memory: 387.5 B
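
The average record size is the total memory footprint divided by the number of observations. The sketch below repeats that arithmetic with the rounded figures above, so it only approximates the reported 387.5 B. The function name is illustrative and not part of pandas-profiling.

```python
# Figures copied from the dataset statistics above.
total_mib = 14.9          # total size in memory, rounded to one decimal
n_observations = 40395

BYTES_PER_MIB = 1024 ** 2


def average_record_size(size_mib: float, n_rows: int) -> float:
    """Return the average number of bytes used per row."""
    return size_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size(total_mib, n_observations)
print(f"{avg_bytes:.1f} B per record")
```

The small gap to the reported value comes from the total being rounded to 14.9 MiB before the division.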

Variable types

NUM: 8
BOOL: 6
CAT: 5

Warnings

city_name has a high cardinality: 1046 distinct values [High cardinality]
region is highly correlated with province [High correlation]
province is highly correlated with region [High correlation]
surface_of_the_land is highly skewed (γ1 = 53.15034165) [Skewed]
df_index has unique values [Unique]
surface_of_the_land has 20751 (51.4%) zeros [Zeros]
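
The skewness and zeros warnings can be checked by hand. The sketch below computes the share of zeros and the population skewness g1 = m3 / m2^1.5 on a made-up toy column. The report's γ1 may come from a bias-corrected estimator, so exact values can differ slightly; the helper names and toy data are assumptions for illustration.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy column: mostly zeros plus one very large plot of land.
land = [0, 0, 0, 0, 0, 0, 150, 200, 300, 400000]

print(zero_share(land))    # share of zeros
print(skewness(land) > 2)  # a single huge value gives strong positive skew
```

A column dominated by zeros with a few very large values, like surface_of_the_land, triggers both warnings at once.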

Reproduction

Analysis started: 2020-09-18 15:33:33.547087
Analysis finished: 2020-09-18 15:33:48.307727
Duration: 14.76 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 40395
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25808.86065
Minimum: 0
Maximum: 52075
Zeros: 1
Zeros (%): < 0.1%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2572.7
Q1: 12715.5
median: 25666
Q3: 38836
95-th percentile: 49437.3
Maximum: 52075
Range: 52075
Interquartile range (IQR): 26120.5
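
Range and IQR follow directly from the other quantiles: range = maximum − minimum and IQR = Q3 − Q1. The sketch below uses the standard library's `statistics.quantiles` with linear interpolation (`method="inclusive"`). Whether the report used exactly this interpolation is an assumption, and `five_number_summary` is an illustrative name.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, quartiles, max, range and IQR of a numeric sequence."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }


summary = five_number_summary([1, 2, 3, 4, 5])
print(summary)

# Consistency check against the df_index figures reported above.
reported_iqr = 38836 - 12715.5
print(reported_iqr)
```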

Descriptive statistics

Standard deviation: 15083.3038
Coefficient of variation (CV): 0.5844234663
Kurtosis: -1.208587665
Mean: 25808.86065
Median Absolute Deviation (MAD): 13087
Skewness: 0.01817640765
Sum: 1042548926
Variance: 227506053.6
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
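
Several of the descriptive statistics are derived from one another: the coefficient of variation is the standard deviation divided by the mean, and the MAD is the median of absolute deviations from the median. The sketch below shows these relations and a monotonicity check in plain Python. The helper names are illustrative, not the profiler's own functions.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def median_absolute_deviation(values):
    """Median of |x - median(x)|."""
    med = median(values)
    return median(abs(v - med) for v in values)


def is_strictly_increasing(values):
    """True when every element is larger than its predecessor."""
    return all(a < b for a, b in zip(values, values[1:]))


# Consistency check using the df_index figures reported above.
cv_from_report = 15083.3038 / 25808.86065
print(round(cv_from_report, 6))

print(median_absolute_deviation([1, 2, 3, 4, 100]))
print(is_strictly_increasing([0, 1, 3, 4]))
```

A strictly increasing, fully distinct column such as df_index is typical of a leftover row index rather than a real feature.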
Value                 Count    Frequency (%)
2047                  1        < 0.1%
11727                 1        < 0.1%
21984                 1        < 0.1%
42462                 1        < 0.1%
48605                 1        < 0.1%
36315                 1        < 0.1%
34266                 1        < 0.1%
40409                 1        < 0.1%
38360                 1        < 0.1%
9678                  1        < 0.1%
19939                 1        < 0.1%
15821                 1        < 0.1%
13772                 1        < 0.1%
3531                  1        < 0.1%
1482                  1        < 0.1%
7625                  1        < 0.1%
5576                  1        < 0.1%
26054                 1        < 0.1%
24033                 1        < 0.1%
32229                 1        < 0.1%
26118                 1        < 0.1%
34298                 1        < 0.1%
30212                 1        < 0.1%
17922                 1        < 0.1%
24065                 1        < 0.1%
Other values (40370)  40370    99.9%

Value                 Count    Frequency (%)
0                     1        < 0.1%
1                     1        < 0.1%
3                     1        < 0.1%
4                     1        < 0.1%
5                     1        < 0.1%
6                     1        < 0.1%
7                     1        < 0.1%
8                     1        < 0.1%
9                     1        < 0.1%
11                    1        < 0.1%

Value                 Count    Frequency (%)
52075                 1        < 0.1%
52073                 1        < 0.1%
52072                 1        < 0.1%
52071                 1        < 0.1%
52070                 1        < 0.1%
52068                 1        < 0.1%
52067                 1        < 0.1%
52065                 1        < 0.1%
52064                 1        < 0.1%
52063                 1        < 0.1%

postal_code
Real number (ℝ≥0)

Distinct: 1057
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5195.044139
Minimum: 1000
Maximum: 9992
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 1080
Q1: 2360
median: 4630
Q3: 8400
95-th percentile: 9420
Maximum: 9992
Range: 8992
Interquartile range (IQR): 6040

Descriptive statistics

Standard deviation: 2979.185308
Coefficient of variation (CV): 0.5734667942
Kurtosis: -1.517977446
Mean: 5195.044139
Median Absolute Deviation (MAD): 2845
Skewness: 0.08975300772
Sum: 209853808
Variance: 8875545.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
8300                  750      1.9%
8400                  685      1.7%
9000                  634      1.6%
1180                  498      1.2%
1000                  451      1.1%
8370                  447      1.1%
4000                  394      1.0%
8670                  365      0.9%
2000                  323      0.8%
1050                  323      0.8%
1070                  301      0.7%
1030                  300      0.7%
9300                  289      0.7%
8620                  285      0.7%
3500                  279      0.7%
1080                  269      0.7%
8000                  267      0.7%
2300                  259      0.6%
2100                  258      0.6%
2018                  248      0.6%
8800                  235      0.6%
8660                  232      0.6%
9100                  219      0.5%
4020                  218      0.5%
8430                  217      0.5%
Other values (1032)   31649    78.3%

Value                 Count    Frequency (%)
1000                  451      1.1%
1020                  123      0.3%
1030                  300      0.7%
1040                  145      0.4%
1050                  323      0.8%
1060                  129      0.3%
1070                  301      0.7%
1080                  269      0.7%
1081                  64       0.2%
1082                  57       0.1%

Value                 Count    Frequency (%)
9992                  5        < 0.1%
9991                  14       < 0.1%
9990                  49       0.1%
9988                  10       < 0.1%
9982                  2        < 0.1%
9981                  3        < 0.1%
9980                  4        < 0.1%
9971                  3        < 0.1%
9970                  6        < 0.1%
9968                  15       < 0.1%

city_name
Categorical

HIGH CARDINALITY

Distinct: 1046
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
Antwerpen             886      2.2%
Knokke                750      1.9%
Oostende              685      1.7%
Gent                  634      1.6%
Uccle                 498      1.2%
Bruxelles             451      1.1%
Uitkerke              447      1.1%
Glain                 394      1.0%
Wulpen                365      0.9%
Ixelles               323      0.8%
Deurne                309      0.8%
Anderlecht            301      0.7%
Schaerbeek            300      0.7%
Aalst                 289      0.7%
Nieuwpoort            285      0.7%
Hasselt               279      0.7%
Molenbeek-Saint-Jean  269      0.7%
Brugge                267      0.7%
Turnhout              259      0.6%
Beveren               258      0.6%
De Panne              232      0.6%
Nieuwkerken-Waas      219      0.5%
Liège                 218      0.5%
Middelkerke           217      0.5%
Renaix                213      0.5%
Other values (1021)   31047    76.9%
Frequencies of value counts

Unique

Unique: 84
Unique (%): 0.2%
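
"Unique" here reads as the number of values that occur exactly once, which differs from "Distinct" (the number of different values). A minimal sketch of that distinction, using a made-up list of city names and an illustrative helper:

```python
from collections import Counter


def count_singletons(values):
    """Number of distinct values that appear exactly once."""
    counts = Counter(values)
    return sum(1 for c in counts.values() if c == 1)


# Hypothetical sample, not taken from the dataset.
cities = ["Gent", "Gent", "Uccle", "Aalst", "Aalst", "Renaix"]

n_distinct = len(set(cities))
n_unique = count_singletons(cities)
print(n_distinct, n_unique)
```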
Histogram of lengths of the category

Length

Max length: 30
Median length: 8
Mean length: 8.565614556
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 62
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
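
Per-category character counts of this kind can be approximated with Python's built-in `unicodedata` module, which returns two-letter general-category codes (`Ll` lowercase letter, `Lu` uppercase letter, `Pd` dash punctuation, `Zs` space separator, `Po` other punctuation). This is a sketch on three sample names, not the profiler's own implementation, and `category_counts` is an illustrative name.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Tally Unicode general categories over all characters in the strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


names = ["Liège", "De Panne", "Molenbeek-Saint-Jean"]
counts = category_counts(names)
for code, n in counts.most_common():
    print(code, n)
```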

Most occurring characters

ValueCountFrequency (%) 
e6391118.5%
 
n266667.7%
 
r238966.9%
 
a187035.4%
 
l186555.4%
 
o164024.7%
 
i163994.7%
 
t159784.6%
 
s148114.3%
 
u111323.2%
 
k88922.6%
 
-84182.4%
 
m75372.2%
 
g64891.9%
 
B60271.7%
 
d59651.7%
 
h51981.5%
 
b46661.3%
 
c44501.3%
 
p43921.3%
 
L40051.2%
 
S37981.1%
 
A37961.1%
 
M37061.1%
 
G34261.0%
 
Other values (37)3869011.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter28741783.1%
 
Uppercase Letter4929414.2%
 
Dash Punctuation84182.4%
 
Space Separator5170.1%
 
Other Punctuation3620.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B602712.2%
 
L40058.1%
 
S37987.7%
 
A37967.7%
 
M37067.5%
 
G34267.0%
 
H33436.8%
 
W28925.9%
 
E19534.0%
 
K19524.0%
 
O18163.7%
 
D17233.5%
 
N14893.0%
 
T13412.7%
 
P12522.5%
 
R11322.3%
 
C10232.1%
 
U9451.9%
 
J9441.9%
 
F8751.8%
 
Z6421.3%
 
I6091.2%
 
V5631.1%
 
Q330.1%
 
À6< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e6391122.2%
 
n266669.3%
 
r238968.3%
 
a187036.5%
 
l186556.5%
 
o164025.7%
 
i163995.7%
 
t159785.6%
 
s148115.2%
 
u111323.9%
 
k88923.1%
 
m75372.6%
 
g64892.3%
 
d59652.1%
 
h51981.8%
 
b46661.6%
 
c44501.5%
 
p43921.5%
 
v31851.1%
 
w26010.9%
 
x17880.6%
 
z13730.5%
 
j10360.4%
 
f8350.3%
 
y6820.2%
 
Other values (8)17750.6%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-8418100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
'362100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
517100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin33671197.3%
 
Common92972.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e6391119.0%
 
n266667.9%
 
r238967.1%
 
a187035.6%
 
l186555.5%
 
o164024.9%
 
i163994.9%
 
t159784.7%
 
s148114.4%
 
u111323.3%
 
k88922.6%
 
m75372.2%
 
g64891.9%
 
B60271.8%
 
d59651.8%
 
h51981.5%
 
b46661.4%
 
c44501.3%
 
p43921.3%
 
L40051.2%
 
S37981.1%
 
A37961.1%
 
M37061.1%
 
G34261.0%
 
H33431.0%
 
Other values (34)3446810.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
-841890.5%
 
5175.6%
 
'3623.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII34452399.6%
 
None14850.4%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e6391118.6%
 
n266667.7%
 
r238966.9%
 
a187035.4%
 
l186555.4%
 
o164024.8%
 
i163994.8%
 
t159784.6%
 
s148114.3%
 
u111323.2%
 
k88922.6%
 
-84182.4%
 
m75372.2%
 
g64891.9%
 
B60271.7%
 
d59651.7%
 
h51981.5%
 
b46661.4%
 
c44501.3%
 
p43921.3%
 
L40051.2%
 
S37981.1%
 
A37961.1%
 
M37061.1%
 
G34261.0%
 
Other values (29)3720510.8%
 

Most frequent None characters

ValueCountFrequency (%) 
é68145.9%
 
è54937.0%
 
ê896.0%
 
â745.0%
 
ô674.5%
 
ë100.7%
 
à90.6%
 
À60.4%
 

(variable name not preserved in the source)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
0                     21440    53.1%
1                     18955    46.9%

price
Real number (ℝ≥0)

Distinct: 3517
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 314114.6616
Minimum: 2500
Maximum: 950000
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 2500
5-th percentile: 120000
Q1: 199000
median: 275000
Q3: 379000
95-th percentile: 680000
Maximum: 950000
Range: 947500
Interquartile range (IQR): 180000

Descriptive statistics

Standard deviation: 168151.6724
Coefficient of variation (CV): 0.5353194006
Kurtosis: 1.949629927
Mean: 314114.6616
Median Absolute Deviation (MAD): 85000
Skewness: 1.370488373
Sum: 1.268866176e+10
Variance: 2.827498492e+10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
249000                556      1.4%
199000                551      1.4%
299000                545      1.3%
225000                521      1.3%
295000                520      1.3%
275000                517      1.3%
325000                428      1.1%
175000                417      1.0%
235000                415      1.0%
195000                411      1.0%
395000                409      1.0%
185000                402      1.0%
265000                399      1.0%
245000                389      1.0%
250000                387      1.0%
285000                371      0.9%
349000                369      0.9%
215000                354      0.9%
165000                338      0.8%
239000                335      0.8%
350000                327      0.8%
269000                326      0.8%
220000                314      0.8%
179000                313      0.8%
229000                312      0.8%
Other values (3492)   30169    74.7%

Value                 Count    Frequency (%)
2500                  3        < 0.1%
6600                  1        < 0.1%
8160                  1        < 0.1%
9999                  1        < 0.1%
10000                 4        < 0.1%
11825                 1        < 0.1%
12500                 1        < 0.1%
14500                 1        < 0.1%
15000                 6        < 0.1%
19000                 1        < 0.1%

Value                 Count    Frequency (%)
950000                70       0.2%
949000                8        < 0.1%
948000                2        < 0.1%
947000                3        < 0.1%
945000                32       0.1%
940000                7        < 0.1%
939000                1        < 0.1%
936000                1        < 0.1%
935000                3        < 0.1%
930000                9        < 0.1%

number_of_rooms
Real number (ℝ≥0)

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.813838346
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 3
95-th percentile: 5
Maximum: 18
Range: 17
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.26096777
Coefficient of variation (CV): 0.4481308502
Kurtosis: 6.883403348
Mean: 2.813838346
Median Absolute Deviation (MAD): 1
Skewness: 1.578245584
Sum: 113665
Variance: 1.590039718
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value                 Count    Frequency (%)
2                     13847    34.3%
3                     13367    33.1%
4                     5747     14.2%
1                     4145     10.3%
5                     2035     5.0%
6                     789      2.0%
7                     228      0.6%
8                     104      0.3%
9                     44       0.1%
10                    42       0.1%
11                    20       < 0.1%
12                    12       < 0.1%
13                    4        < 0.1%
15                    4        < 0.1%
16                    3        < 0.1%
14                    3        < 0.1%
18                    1        < 0.1%

Value                 Count    Frequency (%)
1                     4145     10.3%
2                     13847    34.3%
3                     13367    33.1%
4                     5747     14.2%
5                     2035     5.0%
6                     789      2.0%
7                     228      0.6%
8                     104      0.3%
9                     44       0.1%
10                    42       0.1%

Value                 Count    Frequency (%)
18                    1        < 0.1%
16                    3        < 0.1%
15                    4        < 0.1%
14                    3        < 0.1%
13                    4        < 0.1%
12                    12       < 0.1%
11                    20       < 0.1%
10                    42       0.1%
9                     44       0.1%
8                     104      0.3%

house_area
Real number (ℝ≥0)

Distinct: 657
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.4663201
Minimum: 5
Maximum: 3560
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 5
5-th percentile: 60
Q1: 92
median: 130
Q3: 184
95-th percentile: 324
Maximum: 3560
Range: 3555
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 95.64920638
Coefficient of variation (CV): 0.6273464614
Kurtosis: 60.1319554
Mean: 152.4663201
Median Absolute Deviation (MAD): 42
Skewness: 4.041212374
Sum: 6158877
Variance: 9148.770682
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
90                    892      2.2%
120                   890      2.2%
100                   876      2.2%
150                   812      2.0%
140                   746      1.8%
80                    733      1.8%
110                   700      1.7%
160                   685      1.7%
200                   683      1.7%
130                   657      1.6%
85                    648      1.6%
180                   567      1.4%
70                    550      1.4%
75                    527      1.3%
95                    517      1.3%
125                   450      1.1%
170                   446      1.1%
115                   434      1.1%
105                   409      1.0%
135                   377      0.9%
220                   343      0.8%
145                   342      0.8%
60                    341      0.8%
250                   327      0.8%
65                    318      0.8%
Other values (632)    26125    64.7%

Value                 Count    Frequency (%)
5                     3        < 0.1%
11                    1        < 0.1%
13                    2        < 0.1%
14                    2        < 0.1%
15                    2        < 0.1%
16                    5        < 0.1%
17                    6        < 0.1%
18                    22       0.1%
19                    2        < 0.1%
20                    9        < 0.1%

Value                 Count    Frequency (%)
3560                  1        < 0.1%
2019                  1        < 0.1%
1700                  1        < 0.1%
1640                  1        < 0.1%
1500                  2        < 0.1%
1461                  1        < 0.1%
1350                  1        < 0.1%
1339                  1        < 0.1%
1200                  2        < 0.1%
1121                  1        < 0.1%

(variable name not preserved in the source)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
1                     28176    69.8%
0                     12219    30.2%

open_fire
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
0                     38230    94.6%
1                     2165     5.4%

terrace
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
1                     25052    62.0%
0                     15343    38.0%

garden
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
0                     27419    67.9%
1                     12976    32.1%

surface_of_the_land
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 2952
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 545.8400792
Minimum: 0
Maximum: 400000
Zeros: 20751
Zeros (%): 51.4%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 416
95-th percentile: 1840
Maximum: 400000
Range: 400000
Interquartile range (IQR): 416

Descriptive statistics

Standard deviation: 3609.242736
Coefficient of variation (CV): 6.612271383
Kurtosis: 4663.468703
Mean: 545.8400792
Median Absolute Deviation (MAD): 0
Skewness: 53.15034165
Sum: 22049210
Variance: 13026633.12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                     20751    51.4%
150                   169      0.4%
200                   160      0.4%
1000                  145      0.4%
300                   144      0.4%
250                   142      0.4%
100                   138      0.3%
120                   129      0.3%
400                   120      0.3%
600                   115      0.3%
180                   111      0.3%
130                   110      0.3%
210                   108      0.3%
140                   104      0.3%
90                    102      0.3%
160                   101      0.3%
110                   97       0.2%
70                    96       0.2%
170                   95       0.2%
80                    92       0.2%
500                   92       0.2%
800                   90       0.2%
60                    84       0.2%
220                   82       0.2%
240                   80       0.2%
Other values (2927)   16938    41.9%

Value                 Count    Frequency (%)
0                     20751    51.4%
1                     19       < 0.1%
2                     1        < 0.1%
4                     1        < 0.1%
5                     2        < 0.1%
6                     1        < 0.1%
7                     2        < 0.1%
8                     2        < 0.1%
10                    3        < 0.1%
12                    3        < 0.1%

Value                 Count    Frequency (%)
400000                1        < 0.1%
264781                1        < 0.1%
120300                1        < 0.1%
120000                2        < 0.1%
117800                1        < 0.1%
99148                 1        < 0.1%
98822                 1        < 0.1%
88800                 1        < 0.1%
87600                 1        < 0.1%
86435                 2        < 0.1%

(variable name not preserved in the source)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
2                     14531    36.0%
0                     10360    25.6%
4                     8104     20.1%
3                     7400     18.3%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
21453136.0%
 
01036025.6%
 
4810420.1%
 
3740018.3%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number40395100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
21453136.0%
 
01036025.6%
 
4810420.1%
 
3740018.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common40395100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
21453136.0%
 
01036025.6%
 
4810420.1%
 
3740018.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII40395100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
21453136.0%
 
01036025.6%
 
4810420.1%
 
3740018.3%
 

(variable name not preserved in the source)
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
0                     39699    98.3%
1                     696      1.7%

(variable name not preserved in the source)
Categorical

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
as new                12096    29.9%
good                  10985    27.2%
unknown               9796     24.3%
to be done up         2789     6.9%
to renovate           2441     6.0%
just renovated        2147     5.3%
to restore            141      0.3%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 14
Median length: 6
Mean length: 6.923233073
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 17
Unique unicode categories: 2
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n4886117.5%
 
o4465516.0%
 
e271329.7%
 
251929.0%
 
w218927.8%
 
a166846.0%
 
d159215.7%
 
u147325.3%
 
s143845.1%
 
t122474.4%
 
g109853.9%
 
k97963.5%
 
r48701.7%
 
v45881.6%
 
b27891.0%
 
p27891.0%
 
j21470.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter25447291.0%
 
Space Separator251929.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n4886119.2%
 
o4465517.5%
 
e2713210.7%
 
w218928.6%
 
a166846.6%
 
d159216.3%
 
u147325.8%
 
s143845.7%
 
t122474.8%
 
g109854.3%
 
k97963.8%
 
r48701.9%
 
v45881.8%
 
b27891.1%
 
p27891.1%
 
j21470.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
25192100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin25447291.0%
 
Common251929.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n4886119.2%
 
o4465517.5%
 
e2713210.7%
 
w218928.6%
 
a166846.6%
 
d159216.3%
 
u147325.8%
 
s143845.7%
 
t122474.8%
 
g109854.3%
 
k97963.8%
 
r48701.9%
 
v45881.8%
 
b27891.1%
 
p27891.1%
 
j21470.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
25192100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII279664100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
n4886117.5%
 
o4465516.0%
 
e271329.7%
 
251929.0%
 
w218927.8%
 
a166846.0%
 
d159215.7%
 
u147325.3%
 
s143845.1%
 
t122474.4%
 
g109853.9%
 
k97963.5%
 
r48701.7%
 
v45881.6%
 
b27891.0%
 
p27891.0%
 
j21470.8%
 

lattitude
Real number (ℝ≥0)

Distinct: 1051
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.313450246
Minimum: 2.580669689
Maximum: 6.3009381
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 2.580669689
5-th percentile: 2.9203275
Q1: 3.7141549
median: 4.361194615
Q3: 4.849314652
95-th percentile: 5.622980506
Maximum: 6.3009381
Range: 3.720268411
Interquartile range (IQR): 1.135159752

Descriptive statistics

Standard deviation: 0.8119890403
Coefficient of variation (CV): 0.1882458343
Kurtosis: -0.6577956038
Mean: 4.313450246
Median Absolute Deviation (MAD): 0.5635523147
Skewness: -0.07418063984
Sum: 174241.8227
Variance: 0.6593262016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
4.39970818862.2%
 
3.3233738617501.9%
 
2.92032756851.7%
 
3.71415496341.6%
 
4.33723484981.2%
 
4.3516974511.1%
 
3.140486814471.1%
 
5.5418643941.0%
 
2.7073119163650.9%
 
4.38157073230.8%
 
4.31234013010.7%
 
4.37371213000.7%
 
4.039642422890.7%
 
2.728398652850.7%
 
5.3368383972790.7%
 
4.32277792690.7%
 
3.20736112670.7%
 
4.9484612590.6%
 
4.4695254092580.6%
 
3.144162350.6%
 
2.5806696892320.6%
 
4.17802792190.5%
 
5.57342032180.5%
 
2.8063401232170.5%
 
3.60204652130.5%
 
Other values (1026)3112177.0%
 
ValueCountFrequency (%) 
2.5806696892320.6%
 
2.62625887< 0.1%
 
2.64344877420.1%
 
2.6449117152< 0.1%
 
2.6733217< 0.1%
 
2.7073119163650.9%
 
2.722259264800.2%
 
2.722568815< 0.1%
 
2.728398652850.7%
 
2.7409610513< 0.1%
 
ValueCountFrequency (%) 
6.30093812< 0.1%
 
6.26424981< 0.1%
 
6.2578278< 0.1%
 
6.20535735< 0.1%
 
6.18849323< 0.1%
 
6.16514841< 0.1%
 
6.12589538< 0.1%
 
6.12167935812< 0.1%
 
6.1117543280.1%
 
6.11084045< 0.1%
 

longitude
Real number (ℝ≥0)

Distinct: 1051
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.85386439
Minimum: 49.5085018
Maximum: 51.4743516
Zeros: 0
Zeros (%): 0.0%
Memory size: 315.7 KiB

Quantile statistics

Minimum: 49.5085018
5-th percentile: 50.3102184
Q1: 50.6701887
median: 50.8704524
Q3: 51.1044854
95-th percentile: 51.2996935
Maximum: 51.4743516
Range: 1.9658498
Interquartile range (IQR): 0.4342967

Descriptive statistics

Standard deviation: 0.3251105424
Coefficient of variation (CV): 0.006393035148
Kurtosis: 1.466404874
Mean: 50.85386439
Median Absolute Deviation (MAD): 0.2222474
Skewness: -0.9593327094
Sum: 2054241.852
Variance: 0.1056968648
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%) 
51.22110978862.2%
 
51.349429657501.9%
 
51.23031776851.7%
 
51.03971296341.6%
 
50.80182014981.2%
 
50.84655734511.1%
 
51.29969354471.1%
 
50.6482053941.0%
 
51.097791753650.9%
 
50.82228543230.8%
 
50.83814113010.7%
 
50.86760413000.7%
 
50.94297552890.7%
 
51.144160052850.7%
 
50.9303582790.7%
 
50.85435512690.7%
 
51.21470832670.7%
 
51.32338122590.6%
 
51.21152842580.6%
 
50.96873122350.6%
 
51.094377752320.6%
 
51.19339082190.5%
 
50.64513812180.5%
 
51.183316952170.5%
 
50.74761922130.5%
 
Other values (1026)3112177.0%
 
ValueCountFrequency (%) 
49.508501810< 0.1%
 
49.557756211< 0.1%
 
49.558079414< 0.1%
 
49.55819252< 0.1%
 
49.564206516< 0.1%
 
49.567529618< 0.1%
 
49.57494796< 0.1%
 
49.5814209320.1%
 
49.590312513< 0.1%
 
49.5969055240.1%
 
ValueCountFrequency (%) 
51.4743516340.1%
 
51.467795710< 0.1%
 
51.460924956< 0.1%
 
51.45063155290.1%
 
51.431558254< 0.1%
 
51.412750434< 0.1%
 
51.41206885210.1%
 
51.3994474230.1%
 
51.39753661< 0.1%
 
51.3957408519< 0.1%
 

province
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
Flandre-Occidentale   7235     17.9%
Anvers                5313     13.2%
Flandre-Orientale     5102     12.6%
Hainaut               4115     10.2%
Liège                 3936     9.7%
Bruxelles-Capitale    3836     9.5%
Brabant flamand       3795     9.4%
Limbourg              2593     6.4%
Brabant wallon        1759     4.4%
Namur                 1564     3.9%
Luxembourg            1147     2.8%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 19
Median length: 14
Mean length: 12.25881916
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 31
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
a6259712.6%
 
e5891511.9%
 
n452109.1%
 
l434958.8%
 
r374467.6%
 
i268175.4%
 
t258425.2%
 
d233674.7%
 
-161733.3%
 
c144702.9%
 
u144022.9%
 
F123372.5%
 
O123372.5%
 
B93901.9%
 
b92941.9%
 
s91491.8%
 
m90991.8%
 
L76761.6%
 
g76761.6%
 
55541.1%
 
o54991.1%
 
A53131.1%
 
v53131.1%
 
x49831.0%
 
H41150.8%
 
Other values (6)187263.8%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter41690084.2%
 
Uppercase Letter5656811.4%
 
Dash Punctuation161733.3%
 
Space Separator55541.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F1233721.8%
 
O1233721.8%
 
B939016.6%
 
L767613.6%
 
A53139.4%
 
H41157.3%
 
C38366.8%
 
N15642.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
a6259715.0%
 
e5891514.1%
 
n4521010.8%
 
l4349510.4%
 
r374469.0%
 
i268176.4%
 
t258426.2%
 
d233675.6%
 
c144703.5%
 
u144023.5%
 
b92942.2%
 
s91492.2%
 
m90992.2%
 
g76761.8%
 
o54991.3%
 
v53131.3%
 
x49831.2%
 
è39360.9%
 
p38360.9%
 
f37950.9%
 
w17590.4%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-16173100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
5554100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin47346895.6%
 
Common217274.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
a6259713.2%
 
e5891512.4%
 
n452109.5%
 
l434959.2%
 
r374467.9%
 
i268175.7%
 
t258425.5%
 
d233674.9%
 
c144703.1%
 
u144023.0%
 
F123372.6%
 
O123372.6%
 
B93902.0%
 
b92942.0%
 
s91491.9%
 
m90991.9%
 
L76761.6%
 
g76761.6%
 
o54991.2%
 
A53131.1%
 
v53131.1%
 
x49831.1%
 
H41150.9%
 
è39360.8%
 
C38360.8%
 
Other values (4)109542.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
-1617374.4%
 
555425.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII49125999.2%
 
None39360.8%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
a6259712.7%
 
e5891512.0%
 
n452109.2%
 
l434958.9%
 
r374467.6%
 
i268175.5%
 
t258425.3%
 
d233674.8%
 
-161733.3%
 
c144702.9%
 
u144022.9%
 
F123372.5%
 
O123372.5%
 
B93901.9%
 
b92941.9%
 
s91491.9%
 
m90991.9%
 
L76761.6%
 
g76761.6%
 
55541.1%
 
o54991.1%
 
A53131.1%
 
v53131.1%
 
x49831.0%
 
H41150.8%
 
Other values (5)147903.0%
 

Most frequent None characters

ValueCountFrequency (%) 
è3936100.0%
 

region
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 315.7 KiB

Value                 Count    Frequency (%)
Flandre               24038    59.5%
Wallonie              12521    31.0%
Bruxelles             3836     9.5%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 9
Median length: 7
Mean length: 7.4998886
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
l5675218.7%
 
e4423114.6%
 
a3655912.1%
 
n3655912.1%
 
r278749.2%
 
F240387.9%
 
d240387.9%
 
W125214.1%
 
o125214.1%
 
i125214.1%
 
B38361.3%
 
u38361.3%
 
x38361.3%
 
s38361.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter26256386.7%
 
Uppercase Letter4039513.3%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F2403859.5%
 
W1252131.0%
 
B38369.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
l5675221.6%
 
e4423116.8%
 
a3655913.9%
 
n3655913.9%
 
r2787410.6%
 
d240389.2%
 
o125214.8%
 
i125214.8%
 
u38361.5%
 
x38361.5%
 
s38361.5%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin302958100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
l5675218.7%
 
e4423114.6%
 
a3655912.1%
 
n3655912.1%
 
r278749.2%
 
F240387.9%
 
d240387.9%
 
W125214.1%
 
o125214.1%
 
i125214.1%
 
B38361.3%
 
u38361.3%
 
x38361.3%
 
s38361.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII302958100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
l5675218.7%
 
e4423114.6%
 
a3655912.1%
 
n3655912.1%
 
r278749.2%
 
F240387.9%
 
d240387.9%
 
W125214.1%
 
o125214.1%
 
i125214.1%
 
B38361.3%
 
u38361.3%
 
x38361.3%
 
s38361.3%
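
The high-correlation warning between region and province is expected, because every province lies in exactly one region. One common association measure for two categorical variables is Cramér's V. The sketch below is a plain, uncorrected version on made-up data; whether the report used this exact measure, or a bias-corrected variant, is an assumption.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical rows: each province belongs to a single region.
province = ["Anvers", "Anvers", "Hainaut", "Hainaut", "Namur", "Namur"]
region = ["Flandre", "Flandre", "Wallonie", "Wallonie", "Wallonie", "Wallonie"]

v_nested = cramers_v(province, region)
v_independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(round(v_nested, 3), round(v_independent, 3))
```

Because region is fully determined by province, keeping both columns in a model adds little information beyond province alone.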
 

Interactions

[64 pairwise interaction plots; images not preserved]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
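
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report itself runs; the function name pearson_r and the sample values are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (the 1/n factors cancel in r).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov_xy / (std_x * std_y)


# Hypothetical values, not taken from the dataset.
house_area = [120.0, 160.0, 200.0, 250.0, 340.0]
price = [300000.0, 390000.0, 420000.0, 499000.0, 650000.0]

r = pearson_r(house_area, price)

# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([a * 10.76 for a in house_area], [p / 1000 + 5 for p in price])
```

Here r and r_rescaled agree up to floating-point error, which is the location and scale invariance mentioned above.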

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
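
As a rough sketch of that recipe, the snippet below ranks both variables and then applies the Pearson formula to the ranks. The helper names (average_ranks, pearson_r, spearman_rho) are assumptions for illustration, and ties are given the average of the ranks they span, which is a common convention rather than something the report states.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

For this cubic relationship rho reaches 1 (up to rounding) while r stays below 1, which illustrates why ρ is the better choice for nonlinear monotonic correlations.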

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
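
The pair counting can be written out directly. The sketch below follows the description literally, which corresponds to the variant usually called tau-a; library implementations such as scipy.stats.kendalltau default to the tie-adjusted tau-b, so results can differ when the data contain ties. The function name kendall_tau and the sample values are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, i.e. tau-a."""
    concordant = 0
    discordant = 0
    total = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        total += 1
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / total


# Hypothetical observations: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)
```

With four observations there are six pairs; five are concordant and one is discordant, so tau is (5 - 1) / 6.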

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
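
For orientation, here is a minimal sketch of a bias-corrected Cramér's V in plain Python. It uses the correction as it is commonly stated (subtracting (k-1)(r-1)/(n-1) from φ² and shrinking the row and column counts); whether this matches the report's implementation in every detail is an assumption, and the function name cramers_v_corrected is hypothetical.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))   # observed cell counts
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms (assumed form, see lead-in).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Province/region pairs as they appear in the sample rows, repeated to form a toy sample.
province = ["Liège", "Hainaut", "Brabant flamand", "Brabant wallon"] * 25
region = ["Wallonie", "Wallonie", "Flandre", "Wallonie"] * 25
v = cramers_v_corrected(province, region)
```

Because each province here belongs to exactly one region, the toy sample yields a value close to 1, consistent with the high-correlation warning the report raises for these two columns.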

Missing values

[Figures: missing values plots]

Sample

First rows

df_index  postal_code  city_name  type_of_property  price  number_of_rooms  house_area  fully_equipped_kitchen  open_fire  terrace  garden  surface_of_the_land  number_of_facades  swimming_pool  state_of_the_building  lattitude  longitude  province  region
001050Ixelles0340000620310109520to be done up4.38157150.822285Bruxelles-CapitaleBruxelles
111050Ixelles0520000420000006920to renovate4.38157150.822285Bruxelles-CapitaleBruxelles
231050Ixelles05990004160101110020to be done up4.38157150.822285Bruxelles-CapitaleBruxelles
341050Ixelles05990003160101113020good4.38157150.822285Bruxelles-CapitaleBruxelles
451050Ixelles0575000317100004620just renovated4.38157150.822285Bruxelles-CapitaleBruxelles
561050Ixelles059000042250010020to renovate4.38157150.822285Bruxelles-CapitaleBruxelles
671050Ixelles057500042091000020unknown4.38157150.822285Bruxelles-CapitaleBruxelles
781050Ixelles05950001195111161740as new4.38157150.822285Bruxelles-CapitaleBruxelles
891050Ixelles0595777425000007020unknown4.38157150.822285Bruxelles-CapitaleBruxelles
9111050Ixelles0650000625010006020good4.38157150.822285Bruxelles-CapitaleBruxelles

Last rows

df_index  postal_code  city_name  type_of_property  price  number_of_rooms  house_area  fully_equipped_kitchen  open_fire  terrace  garden  surface_of_the_land  number_of_facades  swimming_pool  state_of_the_building  lattitude  longitude  province  region
40385520634342Hognoul03990004180111168030as new5.45563950.680810LiègeWallonie
40386520644342Hognoul042500033151011030unknown5.45563950.680810LiègeWallonie
40387520657743Obigies039000043401101216440unknown3.36428150.662055HainautWallonie
40388520673050Oud-Heverlee04200005185000146500to be done up4.66789750.821768Brabant flamandFlandre
40389520683050Oud-Heverlee043500042341010030as new4.66789750.821768Brabant flamandFlandre
40390520701472Vieux-Genappe047500052161100155041as new4.40150350.629025Brabant wallonWallonie
40391520711472Vieux-Genappe047500052151010155001good4.40150350.629025Brabant wallonWallonie
40392520721461Haut-Ittre049900052751011156140unknown4.29647250.648804Brabant wallonWallonie
40393520731761Borchtlombeek04950004235100148840unknown4.13691550.848178Brabant flamandFlandre
40394520753381Kapellen048500032200010101940good4.96087850.887345Brabant flamandFlandre